AITopics | langevin mcmc

langevin mcmc

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Neural Information Processing Systems | Dec-23-2025, 22:33:46 GMT

We study the task of efficiently sampling from a Gibbs distribution $d \pi^* = e^{-h} d {\text{vol}}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC; this algorithm involves computing exponential maps in random Gaussian directions and is efficiently implementable in practice. The key to our analysis of Langevin MCMC is a bound on the discretization error of the geometric Euler-Maruyama scheme, assuming $\nabla h$ is Lipschitz and $M$ has bounded sectional curvature. Our error bound matches the error of Euclidean Euler-Maruyama in terms of its stepsize dependence. Combined with a contraction guarantee for the geometric Langevin diffusion under Kendall-Cranston coupling, we prove that the Langevin MCMC iterates lie within $\epsilon$-Wasserstein distance of $\pi^*$ after $\tilde{O}(\epsilon^{-2})$ steps, which matches the iteration complexity for Euclidean Langevin MCMC. Our results apply in general settings where $h$ can be nonconvex and $M$ can have negative Ricci curvature. Under additional assumptions that the Riemannian curvature tensor has bounded derivatives, and that $\pi^*$ satisfies a $CD(\cdot,\infty)$ condition, we analyze the stochastic gradient version of Langevin MCMC, and bound its iteration complexity by $\tilde{O}(\epsilon^{-2})$ as well.
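As an illustration of the update described in the abstract above, the following is a minimal sketch of geometric Langevin MCMC on the unit sphere $S^2$; it is not the authors' implementation. It assumes a step of the form $x_{k+1} = \mathrm{Exp}_{x_k}(-\delta\,\mathrm{grad}\,h(x_k) + \sqrt{2\delta}\,\xi_k)$, with $\xi_k$ a standard Gaussian in the tangent space at $x_k$. The choice of manifold, the potential $h(x) = -\kappa\,\mu^\top x$, the step size, and all function names are assumptions made for this example.

```python
import numpy as np

def tangent_project(x, v):
    """Project an ambient vector v onto the tangent space of the sphere at x."""
    return v - np.dot(x, v) * x

def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along tangent v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return x
    y = np.cos(norm) * x + np.sin(norm) * (v / norm)
    return y / np.linalg.norm(y)  # re-normalise to curb floating-point drift

def geometric_langevin(grad_h, x0, step, n_steps, rng):
    """Sketch of geometric Langevin MCMC on the sphere (assumed update form)."""
    x = x0 / np.linalg.norm(x0)
    samples = np.empty((n_steps, x.size))
    for k in range(n_steps):
        # Riemannian gradient = tangent projection of the ambient gradient.
        drift = -step * tangent_project(x, grad_h(x))
        # Projecting an isotropic Gaussian gives a standard Gaussian in the tangent space.
        noise = np.sqrt(2.0 * step) * tangent_project(x, rng.standard_normal(x.size))
        x = sphere_exp(x, drift + noise)
        samples[k] = x
    return samples

# Hypothetical target: h(x) = -kappa * <mu, x>, i.e. a von Mises-Fisher density.
kappa = 5.0
mu = np.array([0.0, 0.0, 1.0])
grad_h = lambda x: -kappa * mu  # ambient (Euclidean) gradient of h

rng = np.random.default_rng(0)
samples = geometric_langevin(grad_h, np.array([1.0, 0.0, 0.0]),
                             step=0.01, n_steps=20000, rng=rng)
mean_alignment = float(np.mean(samples[5000:] @ mu))  # discard burn-in
print(mean_alignment)
```

Under these assumptions the target is a von Mises-Fisher distribution, for which the mean of $\mu^\top x$ is $\coth\kappa - 1/\kappa \approx 0.80$ at $\kappa = 5$; `mean_alignment` should land near that value, up to discretization bias and Monte Carlo error.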

efficient sampling, langevin mcmc, riemannian manifold, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Neural Information Processing Systems | Oct-3-2025, 02:36:47 GMT

Two recent papers, [Wang et al., 2020, Li et al., 2022] also look into a similar problem of bounding

artificial intelligence, machine learning, theorem 1, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.52)

CoCre-Sam (Kokkuri-san): Modeling Ouija Board as Collective Langevin Dynamics Sampling from Fused Language Models

Taniguchi, Tadahiro, Nagano, Masatoshi, Omoto, Haruumi, Hayashi, Yoshiki

arXiv.org Artificial Intelligence | Jul-17-2025

Collective human activities like using an Ouija board (or Kokkuri-san) often produce emergent, coherent linguistic outputs unintended by any single participant. While psychological explanations such as the ideomotor effect exist, a computational understanding of how decentralized, implicit linguistic knowledge fuses through shared physical interaction remains elusive. We introduce CoCre-Sam (Collective-Creature Sampling), a framework modeling this phenomenon as collective Langevin dynamics sampling from implicitly fused language models. Each participant is represented as an agent associated with an energy landscape derived from an internal language model reflecting linguistic priors, and agents exert stochastic forces based on local energy gradients. We theoretically prove that the collective motion of the shared pointer (planchette) corresponds to Langevin MCMC sampling from the sum of individual energy landscapes, representing fused collective knowledge. Simulations validate that CoCre-Sam dynamics effectively fuse different models and generate meaningful character sequences, while ablation studies confirm the essential roles of collective interaction and stochasticity. Altogether, CoCre-Sam provides a novel computational mechanism linking individual implicit knowledge, embodied collective action, and emergent linguistic phenomena, grounding these complex interactions in the principles of probabilistic sampling.
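The claim that the shared pointer performs Langevin MCMC on the sum of the agents' energies can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the paper's model: each agent is given a hypothetical quadratic energy $E_i(x) = \tfrac{a_i}{2}\lVert x - m_i\rVert^2$ in place of a language-model-derived landscape, and each agent's noise is scaled by $\sqrt{2\delta/N}$ so that the $N$ independent contributions add up to the usual Langevin noise. The function name, weights, and centers are invented for the example.

```python
import numpy as np

def collective_langevin(centers, weights, step, n_steps, rng):
    """Pointer moved by the summed stochastic forces of all agents (toy model)."""
    n_agents, dim = centers.shape
    x = np.zeros(dim)
    path = np.empty((n_steps, dim))
    for k in range(n_steps):
        # Gradient of agent i's assumed energy E_i(x) = a_i/2 * ||x - m_i||^2.
        grads = weights[:, None] * (x - centers)
        # Each agent contributes 1/N of the total Langevin noise variance (assumption).
        noise = np.sqrt(2.0 * step / n_agents) * rng.standard_normal((n_agents, dim))
        forces = -step * grads + noise
        x = x + forces.sum(axis=0)  # pointer displacement = sum of agent forces
        path[k] = x
    return path

# Three hypothetical agents pulling a 2-D pointer towards different positions.
centers = np.array([[-1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
weights = np.array([1.0, 2.0, 3.0])

rng = np.random.default_rng(0)
path = collective_langevin(centers, weights, step=0.01, n_steps=50000, rng=rng)

# The summed energy is quadratic with this minimiser (precision-weighted mean).
fused_mean = (weights[:, None] * centers).sum(axis=0) / weights.sum()
print(path[5000:].mean(axis=0), fused_mean)
```

For quadratic energies the sum is again quadratic, so the pointer should settle around `fused_mean` with per-coordinate variance close to $1/\sum_i a_i$, which no single agent's landscape would produce.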

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.11906

Country:

Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.40)
Europe > United Kingdom > England > Berkshire > Reading (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

Inference-time Scaling of Diffusion Models through Classical Search

Zhang, Xiangcheng, Lin, Haowei, Ye, Haotian, Zou, James, Ma, Jianzhu, Liang, Yitao, Du, Yilun

arXiv.org Machine Learning | May-30-2025

Classical search algorithms have long underpinned modern artificial intelligence. In this work, we tackle the challenge of inference-time control in diffusion models -- adapting generated outputs to meet diverse test-time objectives -- using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space. It employs a theoretically grounded local search via annealed Langevin MCMC and performs compute-efficient global exploration using breadth-first and depth-first tree search. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation. Across all tasks, we observe significant gains in both performance and efficiency. These results show that classical search provides a principled and practical foundation for inference-time scaling in diffusion models. Project page at diffusion-inference-scaling.github.io.

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Machine Learning

2505.23614

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Compact Bayesian Neural Networks via pruned MCMC sampling

Deo, Ratneel, Sisson, Scott, Webster, Jody M., Chandra, Rohitash

arXiv.org Artificial Intelligence | Jan-12-2025

Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. This is mainly due to the problem of sampling multimodal posterior distributions using Markov Chain Monte Carlo (MCMC) sampling and variational inference algorithms. Moreover, the number of model parameters scales exponentially with additional hidden layers, neurons, and features in the dataset. Typically, a significant portion of these densely connected parameters are redundant and pruning a neural network not only improves portability but also has the potential for better generalisation capabilities. In this study, we address some of the challenges by leveraging MCMC sampling with network pruning to obtain compact probabilistic models having removed redundant parameters. We sample the posterior distribution of model parameters (weights and biases) and prune weights with low importance, resulting in a compact model. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution while retaining the model training and generalisation performance accuracy by adapting post-pruning resampling. We evaluate the effectiveness of our MCMC pruning strategy on selected benchmark datasets for regression and classification problems through empirical result analysis. We also consider two coral reef drill-core lithology classification datasets to test the robustness of the pruning model in complex real-world datasets. We further investigate if refining compact BNN can retain any loss of performance. Our results demonstrate the feasibility of training and pruning BNNs using MCMC whilst retaining generalisation performance with over 75% reduction in network size. This paves the way for developing compact BNN models that provide uncertainty estimates for real-world applications.

artificial intelligence, machine learning, pruning, (17 more...)

arXiv.org Artificial Intelligence

2501.06962

Country:

North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.93)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Neural Information Processing Systems | Oct-10-2024, 10:06:36 GMT

We study the task of efficiently sampling from a Gibbs distribution $d \pi^* = e^{-h} d {\text{vol}}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC; this algorithm involves computing exponential maps in random Gaussian directions and is efficiently implementable in practice. The key to our analysis of Langevin MCMC is a bound on the discretization error of the geometric Euler-Maruyama scheme, assuming $\nabla h$ is Lipschitz and $M$ has bounded sectional curvature. Our error bound matches the error of Euclidean Euler-Maruyama in terms of its stepsize dependence. Combined with a contraction guarantee for the geometric Langevin diffusion under Kendall-Cranston coupling, we prove that the Langevin MCMC iterates lie within $\epsilon$-Wasserstein distance of $\pi^*$ after $\tilde{O}(\epsilon^{-2})$ steps, which matches the iteration complexity for Euclidean Langevin MCMC. Our results apply in general settings where $h$ can be nonconvex and $M$ can have negative Ricci curvature.

efficient sampling, langevin mcmc, riemannian manifold, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Energy-Based Models For Speech Synthesis

Sun, Wanli, Tu, Zehai, Ragni, Anton

arXiv.org Artificial Intelligence | Oct-19-2023

Recently there has been a lot of interest in non-autoregressive (non-AR) models for speech synthesis, such as FastSpeech 2 and diffusion models. Unlike AR models, these models do not have autoregressive dependencies among outputs, which makes inference efficient. This paper expands the range of available non-AR models with another member called energy-based models (EBMs). The paper describes how noise contrastive estimation, which relies on the comparison between positive and negative samples, can be used to train EBMs. It proposes a number of strategies for generating effective negative samples, including using high-performing AR models. It also describes how sampling from EBMs can be performed using Langevin Markov Chain Monte-Carlo (MCMC). The use of Langevin MCMC makes it possible to draw connections between EBMs and currently popular diffusion models. Experiments on the LJSpeech dataset show that the proposed approach offers improvements over Tacotron 2.

ebm, hypothesis, tacotron 2, (12 more...)

arXiv.org Artificial Intelligence

2310.12765

Country: Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Chain of Log-Concave Markov Chains

Saremi, Saeed, Park, Ji Won, Bach, Francis

arXiv.org Machine Learning | Sep-28-2023

We introduce a theoretical framework for sampling from unnormalized densities based on a smoothing scheme that uses an isotropic Gaussian kernel with a single fixed noise scale. We prove one can decompose sampling from a density (minimal assumptions made on the density) into a sequence of sampling from log-concave conditional densities via accumulation of noisy measurements with equal noise levels. Our construction is unique in that it keeps track of a history of samples, making it non-Markovian as a whole, but it is lightweight algorithmically as the history only shows up in the form of a running empirical mean of samples. Our sampling algorithm generalizes walk-jump sampling (Saremi & Hyvärinen, 2019). The "walk" phase becomes a (non-Markovian) chain of (log-concave) Markov chains. The "jump" from the accumulated measurements is obtained by empirical Bayes. We study our sampling algorithm quantitatively using the 2-Wasserstein metric and compare it with various Langevin MCMC algorithms. We also report a remarkable capacity of our algorithm to "tunnel" between modes of a distribution.
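For orientation, here is a minimal sketch of plain walk-jump sampling, the single-measurement scheme that the abstract says the new algorithm generalizes; it is not the multi-measurement, non-Markovian algorithm of the paper. The "walk" runs Langevin MCMC on the Gaussian-smoothed density, and the "jump" applies the empirical Bayes estimate $\hat{x}(y) = y + \sigma^2 \nabla \log p_\sigma(y)$. The toy target (a two-component Gaussian mixture with an analytically known smoothed score), the parallel chains, and all names and constants are assumptions made for the example.

```python
import numpy as np

MODES = np.array([-2.0, 2.0])  # toy target: equal-weight Gaussian mixture (assumed)
S = 0.5                        # std of each mixture component (assumed)
SIGMA = 1.0                    # the single fixed noise scale

def smoothed_score(y):
    """Score of p_sigma = p * N(0, SIGMA^2); each component has variance S^2 + SIGMA^2."""
    y = np.asarray(y, dtype=float)
    v = S**2 + SIGMA**2
    diff = MODES - y[..., None]
    logits = -0.5 * diff**2 / v
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    resp = np.exp(logits)
    resp = resp / resp.sum(axis=-1, keepdims=True)        # component responsibilities
    return (resp * diff).sum(axis=-1) / v

def walk(y, step, n_steps, rng):
    """Walk: unadjusted Langevin MCMC on the smoothed density."""
    for _ in range(n_steps):
        y = y + step * smoothed_score(y) + np.sqrt(2.0 * step) * rng.standard_normal(y.shape)
    return y

def jump(y):
    """Jump: empirical Bayes (posterior-mean) denoising of the noisy sample."""
    return y + SIGMA**2 * smoothed_score(y)

rng = np.random.default_rng(0)
y = walk(np.zeros(2000), step=0.05, n_steps=500, rng=rng)  # 2000 parallel chains
x_hat = jump(y)
print(np.mean(np.abs(np.abs(y) - 2.0)), np.mean(np.abs(np.abs(x_hat) - 2.0)))
```

The printed pair compares the average distance to the nearest mode before and after the jump; in this toy setting the denoised samples should sit noticeably closer to the modes than the smoothed ones.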

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2305.19473

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

Protein Discovery with Discrete Walk-Jump Sampling

Frey, Nathan C., Berenberg, Daniel, Zadorozhny, Karina, Kleinhenz, Joseph, Lafrance-Vanasse, Julien, Hotzel, Isidro, Wu, Yan, Ra, Stephen, Bonneau, Richard, Cho, Kyunghyun, Loukas, Andreas, Gligorijevic, Vladimir, Saremi, Saeed

arXiv.org Artificial Intelligence | Jun-8-2023

We resolve difficulties in training and sampling from a discrete generative model by learning a smoothed energy function, sampling from the smoothed data manifold with Langevin Markov chain Monte Carlo (MCMC), and projecting back to the true data manifold with one-step denoising. Our Discrete Walk-Jump Sampling formalism combines the maximum likelihood training of an energy-based model and improved sample quality of a score-based model, while simplifying training and sampling by requiring only a single noise level. We evaluate the robustness of our approach on generative modeling of antibody proteins and introduce the distributional conformity score to benchmark protein generative models. By optimizing and sampling from our models for the proposed distributional conformity score, 97-100% of generated samples are successfully expressed and purified and 35% of functional designs show equal or improved binding affinity compared to known functional antibodies on the first attempt in a single round of laboratory experiments. We also report the first demonstration of long-run fast-mixing MCMC chains where diverse antibody protein classes are visited in a single MCMC chain.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2306.1236

Country:

North America > United States > New York (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.97)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(3 more...)

Conservative objective models are a special kind of contrastive divergence-based energy model

Beckham, Christopher, Pal, Christopher

arXiv.org Artificial Intelligence | Apr-7-2023

In this work we theoretically show that conservative objective models (COMs) for offline model-based optimisation (MBO) are a special kind of contrastive divergence-based energy model, one where the energy function represents both the unconditional probability of the input and the conditional probability of the reward variable. While the initial formulation only samples modes from its learned distribution, we propose a simple fix that replaces its gradient ascent sampler with a Langevin MCMC sampler. This gives rise to a special probabilistic model where the probability of sampling an input is proportional to its predicted reward. Lastly, we show that better samples can be obtained if the model is decoupled so that the unconditional and conditional probabilities are modelled separately.
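The difference between a gradient ascent sampler and a Langevin MCMC sampler can be seen on a toy density. The sketch below is an illustration under assumptions, not the paper's model: it uses a standard Gaussian, $\log p(x) = -x^2/2 + \text{const}$, in place of a learned energy, and the function names and constants are invented. The only difference between the two samplers is the injected noise term.

```python
import numpy as np

def grad_log_p(x):
    """Gradient of log p for the assumed toy density, a standard Gaussian."""
    return -x

def run_sampler(x0, step, n_steps, rng=None):
    """Gradient ascent on log p if rng is None, otherwise unadjusted Langevin MCMC."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x + step * grad_log_p(x)
        if rng is not None:
            # Langevin adds Gaussian noise scaled by sqrt(2 * step) at every update.
            x = x + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
x0 = rng.uniform(-3.0, 3.0, size=5000)  # 5000 independent particles

ascent = run_sampler(x0, step=0.05, n_steps=400)             # collapses onto the mode
langevin = run_sampler(x0, step=0.05, n_steps=400, rng=rng)  # spreads like the density
print(ascent.std(), langevin.std())
```

In this setting gradient ascent drives every particle to the single mode at zero, whereas the Langevin particles keep a spread close to the unit standard deviation of the target, which is the sense in which the noisy sampler draws from the distribution rather than only from its modes.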

artificial intelligence, conservative objective model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2304.03866

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
